Medicare Presentation

Cost of Medical Care

Medical care in the United States cost Americans an average of approximately 8,500 per person per year, the equivalent of 18% of US GDP annually. This is by far the most of any country in the world, both as a dollar ammount and as a percentage of GDP. Most of this cost comes from emergency medical care and scheduled care at hospital facilities.

RProject

Although our health care purchases are arguably the most important ones we will make as consumers, there is remarkably little information on how much different medical procedures cost and what we are receiving in exchange for our money. Supposedly higher quality of care is more expensive, but is this actually true? We used R and Tableau to explore these disparities in the cost of care and their relationship to care quality.

source("RPlots/ConnectToOracle.R", echo = TRUE)
## 
## > library(shiny)
## 
## > library(shinyapps)
## 
## Attaching package: 'shinyapps'
## 
## The following object is masked from 'package:shiny':
## 
##     hr
## 
## > library(rJava)
## 
## > library(DBI)
## 
## > library(RJDBC)
## 
## > library(ggplot2)
## 
## > options(java.parameters = "-Xmx2g")
## 
## > jdbcDriver <- JDBC(driverClass = "oracle.jdbc.OracleDriver", 
## +     classPath = "~/ojdbc7.jar")
## 
## > con <- dbConnect(jdbcDriver, "jdbc:oracle:thin:@128.83.138.158:1521/pdborcl", 
## +     "ds_medicare", "orcl")

Queries

We ran these querries to Import Data into R from the ds_medicare database.

source("RPlots/SQLQuerries.R", echo = TRUE)
## 
## > HCAHPSMeasure = dbGetQuery(con, "Select * From Measures")
## 
## > InpatientServices <- dbGetQuery(con, "Select * from InpatientServices")
## 
## > Providers = dbGetQuery(con, "Select * from Providers")
## 
## > OutpatientServices <- dbGetQuery(con, "Select * from OutpatientServices")
## 
## > OutpatientVisits <- dbGetQuery(con, "select * from Outpatient WHERE ID BETWEEN 0 and 30000")
## 
## > OutpatientVisits = rbind(OutpatientVisits, dbGetQuery(con, 
## +     "select * from Outpatient WHERE ID BETWEEN 30001 and 60000"))
## 
## > OutpatientVisits = rbind(OutpatientVisits, dbGetQuery(con, 
## +     "select * from Outpatient WHERE ID BETWEEN 60001 and 90000"))
## 
## > OutpatientVisits = rbind(OutpatientVisits, dbGetQuery(con, 
## +     "select * from Outpatient WHERE ID BETWEEN 90001 and 100000"))
## 
## > InpatientVisits <- dbGetQuery(con, "select * from Inpatient WHERE ID BETWEEN 0 and 30000")
## 
## > InpatientVisits = rbind(InpatientVisits, dbGetQuery(con, 
## +     "select * from Inpatient WHERE ID BETWEEN 30001 and 60000"))
## 
## > InpatientVisits = rbind(InpatientVisits, dbGetQuery(con, 
## +     "select * from Inpatient WHERE ID BETWEEN 60001 and 90000"))
## 
## > InpatientVisits = rbind(InpatientVisits, dbGetQuery(con, 
## +     "select * from Inpatient WHERE ID BETWEEN 90001 and 100000"))
## 
## > InpatientVisits = rbind(InpatientVisits, dbGetQuery(con, 
## +     "select * from Inpatient WHERE ID BETWEEN 100001 and 130000"))
## 
## > InpatientVisits = rbind(InpatientVisits, dbGetQuery(con, 
## +     "select * from Inpatient WHERE ID BETWEEN 130001 and 160000"))
## 
## > outpatientCostByCity = dbGetQuery(con, "SELECT Providers.City as City, AVG(OutPatient.AverageSubmittedCharges) as AvgBilledCost \n                   ..." ... [TRUNCATED] 
## 
## > outpatientCostByState = dbGetQuery(con, "SELECT Providers.State as State, AVG(OutPatient.AverageSubmittedCharges) as AvgBilledCost \n                ..." ... [TRUNCATED] 
## 
## > outpatientCostByHospital = dbGetQuery(con, "\n                                      SELECT Providers.Name as Hospital, AVG(OutPatient.AverageSubmitt ..." ... [TRUNCATED] 
## 
## > outpatientCostByCity = dbGetQuery(con, "SELECT Providers.City as City, AVG(OutPatient.AverageSubmittedCharges) as AvgBilledCost \n                   ..." ... [TRUNCATED] 
## 
## > outpatientCostByState = dbGetQuery(con, "SELECT Providers.State as State, AVG(OutPatient.AverageSubmittedCharges) as AvgBilledCost \n                ..." ... [TRUNCATED] 
## 
## > outpatientCostByHospital = dbGetQuery(con, "SELECT Providers.Name as Hospital, AVG(OutPatient.AverageSubmittedCharges) as AvgBilledCost \n           ..." ... [TRUNCATED] 
## 
## > InpatientCostByCity = dbGetQuery(con, "SELECT Providers.City as City, AVG(InPatient.CoveredCharges) as AvgBilledCost \n                              ..." ... [TRUNCATED] 
## 
## > InpatientCostByState = dbGetQuery(con, "SELECT Providers.State as State, AVG(Inpatient.CoveredCharges) as AvgBilledCost \n                           ..." ... [TRUNCATED] 
## 
## > InpatientCostByHospital = dbGetQuery(con, "\n                                     SELECT Providers.Name as Hospital, AVG(InPatient.CoveredCharges) a ..." ... [TRUNCATED] 
## 
## > PatientsRated9or10 = dbGetQuery(con, "\n                                Select Providers.Name, Reviews.AnswerPercent FROM Providers\n                ..." ... [TRUNCATED] 
## 
## > PatientsRated9or10$ANSWERPERCENT <- as.numeric(PatientsRated9or10$ANSWERPERCENT)
## 
## > CostVSRating = dbGetQuery(con, "\n                          Select Reviews.Answerpercent AS Rating, Reviews.SurveyID AS Question, \n                 ..." ... [TRUNCATED] 
## 
## > Rated9or10 = subset(CostVSRating, QUESTION == "H_HSP_RATING_9_10")
## 
## > Rated7or8 = subset(CostVSRating, QUESTION == "H_HSP_RATING_7_8")
## 
## > Rated0to6 = subset(CostVSRating, QUESTION == "H_HSP_RATING_0_6")
## 
## > DefinitelyRecommend = subset(CostVSRating, QUESTION == 
## +     "H_RECMND_DY")
## 
## > ProbablyRecommend = subset(CostVSRating, QUESTION == 
## +     "H_RECMND_PY")
## 
## > NotRecommend = subset(CostVSRating, QUESTION == "H_RECMND_DN")
## 
## > TexasQuery = subset(CostVSRating, STATE == "TX")
## 
## > AustinQuery = subset(TexasQuery, REGION == "TX - Austin")
## 
## > AverageCostBy910Rating <- aggregate(cbind(COST, INSUREDCOST) ~ 
## +     PROCEDURE, Rated9or10, mean)
## 
## > CheaperOutpatient = subset(AverageCostBy910Rating, 
## +     COST < 2000)
## 
## > InpatientVisits$TOTALPAYMENTS <- as.numeric(InpatientVisits$TOTALPAYMENTS)
## 
## > p <- subset(OutpatientVisits, APCID == 12)
## 
## > p <- mean(p$AVERAGESUBMITTEDCHARGES)
## 
## > TexasCostByProcedure <- aggregate(INSUREDCOST ~ PROCEDURE, 
## +     TexasQuery, mean)

Hospital Procedure Costs

source("RPlots/Plots2.R", echo = TRUE)
## 
## > p3 <- hist(InpatientVisits$TOTALPAYMENTS, main = "Inpatient Procedure Cost", 
## +     xlab = "Average Ammount Billed Per Procedure", ylab = "# of Hosp ..." ... [TRUNCATED]

plot of chunk unnamed-chunk-3

## 
## > p4 <- hist(OutpatientVisits$AVERAGESUBMITTEDCHARGES, 
## +     main = "Outpatient Procedure Cost", xlab = "Average Amount Billed Per Procedure", 
## +     .... [TRUNCATED]

plot of chunk unnamed-chunk-3

## 
## > p5 <- hist(PatientsRated9or10$ANSWERPERCENT, main = "Patient Satisfaction \nRatings", 
## +     xlab = "Percent of Patients Who Rated \n Their Hospital ..." ... [TRUNCATED]

plot of chunk unnamed-chunk-3

## 
## > p1 <- ggplot(InpatientCostByState, aes(x = STATE, 
## +     y = AVGBILLEDCOST)) + geom_point() + coord_flip()
## 
## > p1

plot of chunk unnamed-chunk-3

## 
## > p2 <- ggplot(outpatientCostByState, aes(x = STATE, 
## +     y = AVGBILLEDCOST)) + geom_point() + coord_flip()
## 
## > p2

plot of chunk unnamed-chunk-3

## 
## > p7 <- ggplot(Rated9or10, aes(x = RATING, y = COST)) + 
## +     geom_point() + facet_wrap(~PROCEDURE)
## 
## > p7
## Warning: Removed 10 rows containing missing values (geom_point).
## Warning: Removed 14 rows containing missing values (geom_point).
## Warning: Removed 11 rows containing missing values (geom_point).
## Warning: Removed 10 rows containing missing values (geom_point).
## Warning: Removed 4 rows containing missing values (geom_point).
## Warning: Removed 6 rows containing missing values (geom_point).
## Warning: Removed 2 rows containing missing values (geom_point).
## Warning: Removed 3 rows containing missing values (geom_point).
## Warning: Removed 7 rows containing missing values (geom_point).
## Warning: Removed 7 rows containing missing values (geom_point).
## Warning: Removed 10 rows containing missing values (geom_point).
## Warning: Removed 1 rows containing missing values (geom_point).
## Warning: Removed 3 rows containing missing values (geom_point).
## Warning: Removed 2 rows containing missing values (geom_point).
## Warning: Removed 2 rows containing missing values (geom_point).
## Warning: Removed 3 rows containing missing values (geom_point).
## Warning: Removed 8 rows containing missing values (geom_point).
## Warning: Removed 14 rows containing missing values (geom_point).
## Warning: Removed 1 rows containing missing values (geom_point).
## Warning: Removed 4 rows containing missing values (geom_point).
## Warning: Removed 2 rows containing missing values (geom_point).
## Warning: Removed 13 rows containing missing values (geom_point).

plot of chunk unnamed-chunk-3

## 
## > p8 <- ggplot(Rated9or10, aes(x = RATING, y = COST)) + 
## +     geom_point() + facet_wrap(~STATE)
## 
## > p8
## Warning: Removed 10 rows containing missing values (geom_point).
## Warning: Removed 21 rows containing missing values (geom_point).
## Warning: Removed 16 rows containing missing values (geom_point).
## Warning: Removed 7 rows containing missing values (geom_point).
## Warning: Removed 2 rows containing missing values (geom_point).
## Warning: Removed 6 rows containing missing values (geom_point).
## Warning: Removed 36 rows containing missing values (geom_point).
## Warning: Removed 28 rows containing missing values (geom_point).
## Warning: Removed 11 rows containing missing values (geom_point).

plot of chunk unnamed-chunk-3

## 
## > p10 <- ggplot(TexasCostByProcedure, aes(x = PROCEDURE, 
## +     y = INSUREDCOST)) + geom_point() + coord_flip()
## 
## > p10

plot of chunk unnamed-chunk-3

## 
## > p11 <- ggplot(TexasQuery, aes(x = RATING, y = COST)) + 
## +     geom_point() + facet_wrap(~PROCEDURE)
## 
## > p11
## Warning: Removed 58 rows containing missing values (geom_point).
## Warning: Removed 58 rows containing missing values (geom_point).
## Warning: Removed 58 rows containing missing values (geom_point).
## Warning: Removed 87 rows containing missing values (geom_point).
## Warning: Removed 58 rows containing missing values (geom_point).

plot of chunk unnamed-chunk-3

## 
## > p12 <- ggplot(AustinQuery, aes(x = RATING, y = COST)) + 
## +     geom_point() + facet_wrap(~PROCEDURE)
## 
## > p12

plot of chunk unnamed-chunk-3

Health Care Costs: Closer to Home

Even within Austin, there is a wide disparity in the cost of care, shown here for both insured and uninsured patients and by procedure:

Hospital Cost and Patient Volume

We did find that higher cost was somewhat positively correlated with hospital volume (the number of patients seen):

And slightly more so nationally:

Hosptial Cost and Patient Quality

And although we found that higher cost also had a slight correlation with hospital quality locally:

At National Level, no correlation between hospital quality and cost:

…that correlation dropped off when the view was expanded to the national level:

So other than patient volume, is there anything else we can point to to explain such wide disparities in cost as this?

And we can see by breaking the data down by procedure that that different procedure specializations between hospitals also aren’t the driving factor in cost disparity. In fact, the only thing positively cooralated here is the cost for insured patients to the cost for uninsured patients.

Factors Affecting Medicare Costs and Copay

To find other possible correlations with high cost, we wanted to see if there were any states that stood out with higher costs in both Medicare reinbursements and patient copay costs and see if we could find any common factors among these high-cost states.

Clustering on Medicare Costs and Copay

Medicare Payments (highest cluster)

WY, SD, OR, NY, NJ, MA, ID, HI, DE, DC, CT, CA, AZ, AK

Copay (highest cluster)

WY, UT, SD, NV, ID, HI, AK

Combined Cluster (highest cluster)

WY, UT, SD, NV, ID, HI, AK

Total Cost

Intersection of the highest Medicare and Copay clusters yielded 4 states: Wyoming, Alaska, South Dakota, and Idaho. These states with high costs for insured patients tended towards voting Democratic and have low populations.

Our most extreme outlier as well as the only Republican state, Wyoming, also had the fewest Medicare procedures per person and the highest number of physicians per person.

Cost and Health Care Access

We also found that, possibly because of the high cost of medical care, there were a large number of people forgoing medical care because of cost.

Cost and Health Care Access

And that the people who needed health care the most were the least likely to receive it.

Cost and Health Care Access

They also tend to be poor, uninsured, and have poor mental health.

Medicare and Health Care Access

This was less true for those 65 and older, however, who were the least likely age group to forgo medical care because of cost.

Medicare and Health Care Access

Being over 62 was also a factor for falling into the group we found was most likely to receive care at 98.81%

Medicare and Health Care Access

However, those uninsured and 65 and over fell into the group of people least likely to receive care. Node 85 was the least likely group to receive care, at 82.65%.

Medicare and Health Care Access

Purely in terms of access and total cost, then, we can say that the Medicare program has been extremely successful in ensuring access to health care for the vulnerable population of seniors.

Case Study: Depression

In our research, we also wanted to identify factors with high correlation to the existence of depression or other mental health disorders.

Dataset

The dataset which was used included answers from a phone survey that took place in 2011-2012. Most questions related to behaviors resulting in leading causes of premature mortality and morbidity among adults. Over 139,000 responses were collected to 177 such questions.

Selected Attributes

18 attributes were chosen and extracted from the dataset as the basis of our examination in order to determine which factors have the highest correlation to mental disorders or depression. After all null values were eliminated from our selection, we had a remaining 108.544 rows.

Attribute Meaning
ADDEPEV2 Ever told you had a depressive disorder?
AGE Age
EDUCAG Level of education (1-4)
EMPLOY Employment Status (1-5)
EXERANY1 Exercise in past 30 days
GENHLTH Would you say that in general your health is? (1-5; Excellent-Poor)
HLTHPLN1 Do you have any kind of health coverage?
INCOME2 Income Level (1-8)
INCOMG Computed Income Categories (1-5)
MEDCOST Was there a time in the past 12 months, when you could not see a doctor because of the cost?
RFBING5 Binge Drinking Calculated Variable
RFBMI5 Overweight or obese calculated variable (BMI > 25.0)
RFDRHV4 Heavy Alcohol Consumption Calculated Variable
RFHLTH Adults with good or better health
RFSMOK3 Adults who are currently smokers
SEX Sex
VETERAN3 Are you a veteran?

Oracle Data Miner Workflow

Filtering Null Values

Explore Data Node on Filtered Values

Depression Clustering

Feature Build Node

NMF Feature Build Coefficients

Each feature created by NMF is a linear combination of the original attribute set. Each feature has a set of coefficients, which are a measure of the weight of each attribute on the feature. There is a separate coefficient for each numerical attribute and for each distinct value of each categorical attribute. The coefficients are all non-negative.

Plots and Graphs